Abstract

This note investigates a number of scenarios in which unadjusted
testing following a blinded sample size re-estimation leads to type I
error violations. For superiority testing, this occurs in certain
small-sample borderline cases. We discuss a number of alternative
approaches that control the type I error rate. The paper also gives a
reason why the type I error inflation in the superiority context might
have been missed in previous publications and investigates why it is
more marked in the case of non-inferiority testing.

1 Introduction 

Sample size re-estimation (SSR) in clinical trials has a long history that
dates back to Stein (1945). A sample size review at an interim analysis aims
at correcting assumptions which were made at the planning stage of the trial
but turn out to be unrealistic. When the sample units are considered to
be normally distributed, this typically concerns the initial assumption about
the variation of responses. Wittes and Brittain (1990) and Gould and Shih
(1992, 1998), among others, discussed methods of blinded SSR. In contrast
to unblinded SSR, blinded SSR assumes that the actually realized effect size
estimate is not disclosed to the decision makers who do the SSR. Wittes et
al. (1999) and Zucker et al. (1999) investigated the performance of various
blinded and unblinded SSR methods by simulation. They observed some
slight type I error violations in cases with small sample size and gave
explanations for this phenomenon for some of the unblinded approaches
available at that time.

Slightly later, Kieser and Friede ([1], [2]) suggested a method of blinded
sample size review which is particularly easy to implement. In a trial with 
normally distributed sample units with the aim of testing for a significant 
treatment effect ("superiority testing") at the final analysis, it estimates the 
variance under the null hypothesis of no treatment effect and then proceeds 
to an unmodified t-test in the final analysis, i.e. a test that ignores the fact 
that the final sample size was not fixed from the onset of the trial. Kieser and 
Friede investigated the type I error control of their suggestion by simulation. 
They conclude that no additional measures to control the significance level 
are required in these designs if the study is evaluated with the common t-test 
and the sample size is recalculated with any of these simple blind variance 
estimators. 

Although Kieser and Friede explicitly stated that they provide no formal 
proof of type I error control, it seems to us that many statisticians in the 
pharmaceutical industry are under the impression that such a proof is
available. This, however, is not the case. In this paper, we show that in certain
situations, the method suggested by Kieser and Friede does not control the 
type I error. 

It should be emphasized that asymptotic type I error control with blinded
SSR is guaranteed. If the sample size of only one of the two stages tends
to infinity, the other stage is obviously irrelevant for the asymptotic value
of the final test statistic and thus the method asymptotically keeps $\alpha$. If
the sample size in both stages goes to infinity, then the stage-1 estimate of
the variance converges to a constant value. Hence, whatever sample size
re-estimation rule is used, it implicitly fixes the total sample size in advance
(though its precise value is not yet known before the interim). In either case,
$\alpha$ is again kept asymptotically. Govindarajulu (2003) has formalized this
thought and extended it to non-normally distributed data. As a consequence,
the type I error violations discussed in this note are very small and occur
in cases with small samples. We still believe, however, that the statistical
community should be made aware of these limitations of blinded sample size
review methodology.

While sections 2 to 4 focus on the common case of testing for treatment
differences in clinical trials, section 5 briefly discusses the case of testing for
non-inferiority of one of the two treatments. It had been noted in another
paper by Friede and Kieser [13] that type I error inflations from SSR can be
more marked in this situation. We give an explanation of this phenomenon.

2 A scenario leading to type I error violation 

In this section we show that in certain situations, blinded sample size
review as suggested by [1] leads to a type I error rate which is larger than
the nominal level $\alpha$.

In general, blinded sample size review is characterized by the fact that the
final sample size of the study may be changed at interim analyses, but that
this change depends on the data only via the total variance, which is the
variance estimate under the null hypothesis of interest. If $x_i$, $i = 1, \ldots, n_1$,
are stochastically independent normally distributed observations, this total
variance is proportional to $\sum_{i=1}^{n_1} x_i^2$ in the one-sample and to
$\sum_{i=1}^{n_1} x_i^2 - n_1 \bar{x}^2$ in the two-sample case.

We consider the one-sample t test of $H_0: \mu = 0$ at level $\alpha$ applied to
$x_i \sim N(\mu, \sigma^2)$. The reason for this is simplicity of notation and the fact
that the geometric considerations given below cannot be visualized for the
two-sample case, which would have to deal with a dimension larger than three
even in the simplest setup. However, the restriction to the one-sample case
entails no loss of generality, as it is conceptually the same as the two-sample
case. We will briefly comment on this further below. In addition, a blinded
sample size review may also be of practical relevance in the one-sample
situation, for example in cross-over trials.

Assume a blinded sample size review after $n_1 = 2$ observations. If the
total variance is small, we stop sampling and test with the $n_1 = n = 2$
observations we have obtained. If it is large, we take another sample element
$x_3$ and do the test with $n = 3$ observations. This rule implies that $n = 2$
for $x_1^2 + x_2^2 < r^2$ and $n = 3$ otherwise, for some fixed scalar $r$. Geometrically,
the rejection region of the (one-sided) t test for $n = 3$ is a spherical cone
with the equiangular line $x_1 = x_2 = x_3$ as its central axis in
three-dimensional space. By definition, the probability mass of this cone is $\alpha$
under $H_0$. For the case of $n = 2$, the rejection region is a segment of the circle
$x_1^2 + x_2^2 < r^2$ around the equiangular line $x_1 = x_2$. Hence, in three dimensions,
the rejection region is a segment of the spherical cylinder $x_1^2 + x_2^2 < r^2$, $x_3$
arbitrary. Conditional on lying inside the cylinder, the probability mass
covered by this segment is again $\alpha$. The rejection region of the entire
procedure is the segment of the cylinder plus the spherical cone minus the
intersection of the cone with the cylinder. We now approximate the probability
mass of these components.

For $r^2$ small, we approximately have $P(x_1^2 + x_2^2 < r^2) = \frac{r^2}{2\sigma^2}$. Hence, under
$H_0$, the probability mass of this part of the rejection region is approximately
$\frac{r^2}{2\sigma^2} \cdot \alpha$. The volume of the intersection of the cone with the cylinder can
be approximated as follows: The central axis $x_1 = x_2 = x_3$ of the cone
intersects with the cylinder in one of the points $\pm (r/\sqrt{2}, r/\sqrt{2}, r/\sqrt{2})$. The
distance of this point to the origin is thus $h = \sqrt{3/2}\, r$. Approximating the
intersection by the part of the cone that lies inside a ball of radius $h$, the
volume of this ball is $\frac{4}{3}\pi h^3 = \sqrt{6}\pi r^3$, of which the cone covers the fraction
$\alpha$. To conservatively approximate the probability mass of this intersection,
we assume that every point in it has the same probability mass as the origin
(in reality, it of course has a lower probability mass). Then the probability
mass of the intersection is approximated by
$\sqrt{6}\pi r^3 \cdot \alpha \cdot (\sqrt{2\pi}\sigma)^{-3}$, where $(\sqrt{2\pi}\sigma)^{-3}$ is the value of the normal
density $N_3(0, \sigma^2 I_3)$ at the point 0. Combining these results, a conservative
approximation of the probability mass of the rejection region for the entire
procedure is

$$\alpha \cdot \left(1 + \frac{r^2}{2\sigma^2} - \frac{\sqrt{6}\pi r^3}{(\sqrt{2\pi}\sigma)^3}\right) = \alpha \cdot \left(1 + \frac{r^2}{2\sigma^2} - \frac{\sqrt{3}\, r^3}{2\sqrt{\pi}\sigma^3}\right). \qquad (1)$$

Obviously, this is larger than $\alpha$ for small $r$.
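
As a quick numerical illustration, the following Python sketch evaluates the conservative approximation $\alpha (1 + r^2/(2\sigma^2) - \sqrt{3} r^3/(2\sqrt{\pi}\sigma^3))$ for a few small values of $r$. The function name `approx_rejection_mass` and the chosen values of $r$ are ours and purely illustrative; the expression is only meant for small $r$.

```python
import math

def approx_rejection_mass(r, sigma=1.0, alpha=0.05):
    """Small-r approximation of the null rejection probability in the
    n1 = 2, n = 2 or 3 example (hypothetical helper, not from the paper)."""
    # Contribution of the cylinder segment, relative to alpha
    cylinder_part = r**2 / (2 * sigma**2)
    # Conservative correction for the overlap of cone and cylinder
    overlap_part = math.sqrt(3) * r**3 / (2 * math.sqrt(math.pi) * sigma**3)
    return alpha * (1 + cylinder_part - overlap_part)

for r in (0.05, 0.1, 0.2, 0.5):
    print(f"r = {r:4.2f}: approximate null rejection probability "
          f"{approx_rejection_mass(r):.6f}")
```

For these values the $r^2$ term dominates the $r^3$ term, so every printed value lies above the nominal 0.05.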

For the more general case of a stage-1 sample size of $n_1$, possibly followed
by a stage 2 with $n_2$ further observations, the rejection region of the "sample
size reviewed" t test has an approximate null probability following the same
basic principle as (1):

$$\alpha \cdot \left[ 1 + const_1 \cdot \left(\frac{r}{\sigma}\right)^{n_1} - const_2 \cdot \left(\frac{r}{\sigma}\right)^{n_1 + n_2} \right]$$

if $r$, $n_1$ and $n_2$ are small.
Consequently, there must be situations with small $r$ and small sample sizes
where the blinded review procedure cannot keep the type I error level $\alpha$
exactly. Due to symmetry of the rejection region, this statement holds for both
the one- and the two-sided test of $H_0$.

Note that in this example, the test keeps $\alpha$ exactly if $\sum_{i=1}^{n_1} x_i^2 < r^2$. This
is due to the sphericity of the conditional null distribution of $(x_1, \ldots, x_{n_1})$
given $\sum_{i=1}^{n_1} x_i^2 < r^2$ (see [3], theorem 2.5.8). The type I error violation stems
from the fact that the test does not keep $\alpha$ conditional on $\sum_{i=1}^{n_1} x_i^2 > r^2$, i.e.
if a second stage of sampling more observations is done.

To investigate the magnitude of the ensuing type I error violation, we
simulated 10'000'000 cases with $n_1 = 2$ initial observations and $n_2 = 2$
additional observations that are only taken if $x_1^2 + x_2^2 > 0.5$. The true type I
error of the two-sided combined t test turned out to be 0.0542 for a nominal
$\alpha = 0.05$. As expected, this is caused by the situations where stage-2 data
are obtained. Since $x_1^2 + x_2^2 \sim \chi^2(2)$, we have $P(x_1^2 + x_2^2 > 0.5) = 0.779$. This
was also the value observed in the simulations. The rejection rate for these
cases alone was 0.0553. If $x_1^2 + x_2^2 < 0.5$, we know that conditionally the
rejection rate is exactly $\alpha$. Accordingly, this conditional rejection rate in the
simulations was 0.0500.
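
The simulation described above can be sketched in Python roughly as follows. This is a minimal sketch under our own assumptions, not the code used for the reported numbers: the function name `simulate_blinded_review`, the seed and the reduced number of 1'000'000 runs are ours, and NumPy and SciPy are assumed to be available.

```python
import numpy as np
from scipy import stats

def simulate_blinded_review(n_runs=1_000_000, n1=2, n2=2, r2=0.5,
                            alpha=0.05, seed=12345):
    """Null simulation of the one-sample t test with a blinded review:
    stage 2 is only sampled if the stage-1 sum of squares exceeds r2."""
    rng = np.random.default_rng(seed)
    # Draw n1 + n2 observations under H0 (mu = 0, sigma = 1); the stage-2
    # columns are ignored for runs that stop after stage 1.
    x = rng.standard_normal((n_runs, n1 + n2))
    stage2 = (x[:, :n1] ** 2).sum(axis=1) > r2

    def rejects(sample):
        # Two-sided one-sample t test applied row by row
        n = sample.shape[1]
        t = np.sqrt(n) * sample.mean(axis=1) / sample.std(axis=1, ddof=1)
        return np.abs(t) > stats.t.ppf(1 - alpha / 2, df=n - 1)

    rejected = np.empty(n_runs, dtype=bool)
    rejected[stage2] = rejects(x[stage2])
    rejected[~stage2] = rejects(x[~stage2, :n1])
    return {
        "overall": rejected.mean(),
        "p_stage2": stage2.mean(),
        "given_stage2": rejected[stage2].mean(),
        "given_stop": rejected[~stage2].mean(),
    }

result = simulate_blinded_review()
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Calling `simulate_blinded_review(n1=5, n2=5, r2=2.5)` gives the corresponding sketch for larger stage sizes. With fewer runs than in the text, the estimates carry more Monte Carlo noise.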

If $n_1$ and $n_2$ are increased, the true type I error rate converges rather
quickly to $\alpha$. For example, in case of $n_1 = n_2 = 5$ and $r^2 = 2.5$, the
simulated error rate is 0.0508 with 77.6% of cases leading to stage 2 and a
conditional error rate of 0.0510 in case stage 2 applies.

We also performed some simulations where $n_2$ is determined with the
algorithm suggested by [1]. For this purpose, we generated 10'000'000
simulation runs of a blinded sample size review after $n_1 = 2$ observations
following the rule given in section 3 of [1] with a very large assumed effect
of $\delta = 2.2$. This produces an average of 3.09 additional observations $n_2$.
The simulated type I error was 0.05077.

To see that the two-sample case is also covered by these investigations,
note that the ordinary t-test statistic can be viewed as $X / \sqrt{Y/s}$ where
$X \sim N(\delta, 1)$ is stochastically independent of $Y \sim \chi^2(s)$. Regarding any
investigation of the properties of this quantity, it obviously does not matter
if the random variables $X$ and $Y$ arise as mean and variance estimate from
a one-sample situation or as difference in means and common within-group
variance estimate in the two-sample case. The same is true here: According
to [1], p. 3575, the "resampled" t-test statistic consists of the four
components $D_1$, $V_1$, $D_2 | (V_1, D_1)$ and $V_2 | (V_1, D_1)$ (loosely speaking, these
correspond to the differences in means and variance estimates of the two
stages). Comparing the distributions of $D_1$ and $V_1$ and the conditional
distributions of $D_2$ and $V_2$ given $D_1$ and $V_1$ (and hence $n_2$), one immediately
sees that these are the same for the one- and the balanced two-sample case
when we replace $n_i$ by $n_i/2$ and the means of the two stages by the
corresponding two differences in means between the two treatment groups. For
the conditional distribution of $V_2^* | (V_1, D_1)$ see section 4.

3 Approaches that control the type I error



3.1 Permutation and rotation tests 

If the considerations from the previous section are of concern, then a
simple alternative is to do the test as a permutation test. In the one-sample
case, one would generate all permutations (or a large number of random
permutations) of the signs onto the absolute values of observations. For each
permutation, the t test would be calculated and the $(1 - \alpha)$-quantile of the
resulting empirical distribution of t-test values gives the critical value of an
exact level-$\alpha$ test of $H_0$. Alternatively, a p-value can be obtained by counting
the percentage of values from the permutation distribution which are larger
than or equal to the actually observed value of the test statistic. After
determining the additional sample size $n_2$ from the first $n_1$ observations, we
apply the permutation method to all $n_1 + n_2$ observations. The special case of
$n_2 = 0$ is possible and then the parametric (non-permutation) t-test can also be
used. This strategy keeps the $\alpha$-level exactly, because the total variance
$\sum_{i=1}^{n_1} x_i^2$ is invariant to the permutations.
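
A minimal Python sketch of the one-sample sign-permutation test described above could look as follows. The function names, the number of random permutations and the convention of counting the observed statistic itself among the permutations (so that the Monte Carlo p-value is never zero) are our own choices for illustration; NumPy is assumed.

```python
import numpy as np

def t_statistic(x):
    """One-sample t statistic along the last axis."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    return np.sqrt(n) * x.mean(axis=-1) / x.std(axis=-1, ddof=1)

def sign_flip_pvalue(x, n_perm=9999, seed=0):
    """One-sided p-value from random sign permutations of |x|.

    x holds all n1 + n2 observations. Sign flips leave every x_i^2 and
    hence the stage-1 total variance (and thus n2) unchanged.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    t_obs = t_statistic(x)
    signs = rng.choice([-1.0, 1.0], size=(n_perm, x.size))
    t_perm = t_statistic(signs * np.abs(x))
    # Share of permutation statistics at least as large as the observed
    # one, with the observed statistic counted as one permutation.
    return (1 + np.sum(t_perm >= t_obs)) / (n_perm + 1)

# Hypothetical pooled data from both stages
x_all = np.array([2.1, 1.9, 2.5, 1.7, 2.2, 2.0, 1.8, 2.4])
print(f"sign-permutation p-value: {sign_flip_pvalue(x_all):.4f}")
```

For very small samples all $2^n$ sign patterns can be enumerated instead of sampled, which gives the exact permutation distribution.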

In the two-sample case, the approach would permute the treatment
allocations of the observations. In order to preserve the observed total variance,
the permutations have to be done separately for the $n_1$ observations of stage
1 and the $n_2$ observations of stage 2, respectively.

If sample sizes are small, permutation tests suffer from the discreteness
of the resampling distribution and the associated loss of power. In this case,
rotation tests [4], [5] offer an attractive alternative. These replace the random
permutations of the sample units by random rotations. This renders the
support of the corresponding empirical distribution continuous and thus avoids
the discreteness problem of the permutation strategy. In order to facilitate
this, rotation tests require the assumption of a spherical null distribution.
This is the case in this context. Stage-1 and stage-2 data have to be rotated
separately even in the one-sample case in order to keep the fixed observed
stage-1 value of the total variance.
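
The stage-wise rotation idea can be sketched in Python as below. This is only an illustration under our own assumptions, not the implementation of [4], [5]: we use the fact that applying a uniformly distributed random rotation to a fixed vector yields a uniformly distributed direction with the same Euclidean norm, so no rotation matrices are generated explicitly. All function names are hypothetical; NumPy is assumed.

```python
import numpy as np

def random_rotations_of(v, n_rot, rng):
    """Rows are copies of v under independent random rotations, i.e.
    uniformly distributed directions with the same norm as v."""
    z = rng.standard_normal((n_rot, v.size))
    return z / np.linalg.norm(z, axis=1, keepdims=True) * np.linalg.norm(v)

def t_statistic(x):
    """One-sample t statistic along the last axis."""
    n = x.shape[-1]
    return np.sqrt(n) * x.mean(axis=-1) / x.std(axis=-1, ddof=1)

def rotation_pvalue(stage1, stage2, n_rot=9999, seed=0):
    """One-sided rotation-test p-value; the two stages are rotated
    separately so that the stage-1 sum of squares stays fixed."""
    rng = np.random.default_rng(seed)
    stage1 = np.asarray(stage1, dtype=float)
    stage2 = np.asarray(stage2, dtype=float)
    t_obs = t_statistic(np.concatenate([stage1, stage2]))
    rotated = np.hstack([random_rotations_of(stage1, n_rot, rng),
                         random_rotations_of(stage2, n_rot, rng)])
    t_rot = t_statistic(rotated)
    return (1 + np.sum(t_rot >= t_obs)) / (n_rot + 1)

# Hypothetical data from the two stages
p = rotation_pvalue([2.1, 1.9, 2.5, 1.7, 2.2], [2.0, 1.8, 2.4, 2.3])
print(f"rotation-test p-value: {p:.4f}")
```

Because each stage keeps its own sum of squares under rotation, the re-estimated $n_2$ is the same for every rotated data set.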

Permutation and rotation strategies emulate the true distribution of the 
t test including sample size review. Hence, they will "automatically" correct 
any type I error inflation as outlined in the previous section, but will
otherwise have almost identical properties (e.g. with respect to power) as their
"parametric" counterpart. We did some simulations of the permutation and 
rotation strategies under null and non-null scenarios. These, however, just 
backed up the statements made here and are thus not reported. 

3.2 Combinations of test statistics from the two stages 

Methods that use a combination of test statistics from the two stages are
another alternative if one is looking for an exact test. For example, we might
use Fisher's p-value combination $-2\log(p_1 \cdot p_2)$ [6] where $p_j = P(T_j > t_j)$
with $T_j$ being the test statistic from stage-$j$ data only and $t_j$ its observation
from the concrete data at hand. As $-2\log(p_1 \cdot p_2) \sim \chi^2(4)$ for independent
test statistics $T_1$ and $T_2$ under $H_0$, the combination p-value test rejects $H_0$ if
$-2\log(p_1 \cdot p_2)$ is larger than the $(1 - \alpha)$-quantile of this distribution. In
this application, we use the true null distributions of the test statistics $T_j$ to
determine the p-values. For example, in case of the one-sample t-test these
are the t-distributions $T_1 \sim t(n_1 - 1)$ and $T_2 \sim t(n_2 - 1)$.
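
Fisher's combination of the two stage-wise one-sided t-test p-values can be sketched in Python as follows. The function names and the example data are hypothetical and only serve as illustration; NumPy and SciPy are assumed.

```python
import numpy as np
from scipy import stats

def one_sided_t_pvalue(x):
    """p_j = P(T_j > t_j) for the one-sample t test on one stage."""
    x = np.asarray(x, dtype=float)
    n = x.size
    t = np.sqrt(n) * x.mean() / x.std(ddof=1)
    return stats.t.sf(t, df=n - 1)

def fisher_combination_pvalue(p1, p2):
    """p-value of -2 log(p1 * p2) under its chi-square(4) null distribution."""
    statistic = -2.0 * np.log(p1 * p2)
    return stats.chi2.sf(statistic, df=4)

# Hypothetical stage-1 and stage-2 observations
stage1 = np.array([0.8, -0.3, 1.4, 0.6, 0.1])
stage2 = np.array([0.5, 1.1, -0.2, 0.9, 0.4, 0.7])
p1, p2 = one_sided_t_pvalue(stage1), one_sided_t_pvalue(stage2)
print(f"p1 = {p1:.4f}, p2 = {p2:.4f}, "
      f"combined p = {fisher_combination_pvalue(p1, p2):.4f}")
```

$H_0$ is rejected at level $\alpha$ if the combined p-value is at most $\alpha$, which is equivalent to comparing $-2\log(p_1 \cdot p_2)$ with the $(1 - \alpha)$-quantile of $\chi^2(4)$.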

The stage-2 sample size $n_2$ is uniquely determined by $\sum_{i=1}^{n_1} x_i^2$. Since $T_1$
is a test statistic for which Theorem 2.5.8 of [3] holds under $H_0$, the null
distribution of $T_1$ is valid also conditionally on $\sum_{i=1}^{n_1} x_i^2$. As a consequence,
$T_1 \sim t(n_1 - 1)$ and $T_2 \sim t(n_2 - 1)$ are stochastically independent under $H_0$
for given $\sum_{i=1}^{n_1} x_i^2$. Any combination of them can be used as the test statistic
for $H_0$. Of course, one still has to find critical values of the null distribution
for the selected combination.

The statement about the conditional null distributions of the test
statistics given the total variance $\sum_{i=1}^{n_1} x_i^2$ allows us to go beyond Fisher's
p-value combination and similar methods that combine p-values using fixed
weights or calculate conditional error functions with an "intended"
stage-2 sample size. The weights used to combine the two stages may also
depend on the observed stage-1 data. For example, if the variance were known
(and hence a z-test for $H_0$ could be done), then the optimal (standardized)
weights for combining the z-statistics from the two stages would be
$\sqrt{\frac{n_1}{n_1 + n_2}}$ and $\sqrt{\frac{n_2}{n_1 + n_2}}$ in the one-sample case. Hence,

$$t_{comb} = \sqrt{\frac{n_1}{n_1 + n_2}}\, t_1 + \sqrt{\frac{n_2}{n_1 + n_2}}\, t_2$$

seems a promising candidate for a combination test statistic. The fact that
$T_j$, $j = 1, 2$, retain their $t(n_j - 1)$ null distributions if we condition on
$s_1^2 = \sum_{i=1}^{n_1} x_i^2$ means that critical values for this test can be obtained from
the distribution of the weighted sum of two stochastically independent
t-distributed random variables with $(n_1 - 1)$ and $(n_2 - 1)$ degrees of freedom,
respectively. These critical values are easily obtained by numerical
integration or simulation. Comparing $t_{comb}$ with these critical values (that
depend only on $n_1$ and $n_2$) to decide about the rejection of $H_0$ gives an exact
level-$\alpha$ test.
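
The simulation route to the critical values of the weighted combination $\sqrt{n_1/(n_1+n_2)}\, t_1 + \sqrt{n_2/(n_1+n_2)}\, t_2$ can be sketched in Python as follows. The function names, the seed and the number of simulated values are our own illustrative choices; NumPy is assumed, and both stages need at least two observations so that the t statistics exist.

```python
import numpy as np

def t_comb(t1, t2, n1, n2):
    """Weighted combination of the stage-wise t statistics."""
    w1 = np.sqrt(n1 / (n1 + n2))
    w2 = np.sqrt(n2 / (n1 + n2))
    return w1 * t1 + w2 * t2

def critical_value_t_comb(n1, n2, alpha=0.05, n_sim=1_000_000, seed=0):
    """Simulated one-sided level-alpha critical value: the (1 - alpha)
    quantile of the weighted sum of independent t(n1-1) and t(n2-1)."""
    rng = np.random.default_rng(seed)
    t1 = rng.standard_t(n1 - 1, size=n_sim)
    t2 = rng.standard_t(n2 - 1, size=n_sim)
    return np.quantile(t_comb(t1, t2, n1, n2), 1 - alpha)

for n1, n2 in [(5, 5), (5, 12), (30, 30)]:
    print(f"n1 = {n1:2d}, n2 = {n2:2d}: critical value "
          f"{critical_value_t_comb(n1, n2):.3f}")
```

In an application, $n_2$ is first obtained from the blinded review, the critical value is then looked up for the realized pair $(n_1, n_2)$, and $H_0$ is rejected if the observed combination exceeds it.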

To investigate the performance of the introduced suggestions, we did
several simulations. The critical values for the one-sided one-sample test using
$t_{comb}$ were obtained by simulating 1'000'000 values of two independent
t-distributions with $n_1$ fixed and $n_2$ as determined by the SSR method in [1].
We used the "total variance" for SSR, not the "adjusted variance" which
subtracts a constant based on the putative effect size. Nevertheless, the
re-estimated sample size of course depends on the "assumed effect" which may
be different from the true, unknown effect size. In the simulations, we
investigated various combinations of the true effect size $\mu$ and an assumed
effect size $\delta$.




Figure 1: Power of various tests after sample size re-estimation

Null simulations verified the claimed type I error control for the various
adjustment methods described in this section and are thus not reported.
Figure 1 shows the results of 1'000'000 simulation runs for sample sizes of
$n_1 = 5$ and $n_1 = 30$, a true effect size of $\mu = 0.2$ (the standardized true
effect size, such that the non-centrality parameter of a standard t-test with
$n$ observations would be $\sqrt{n}\mu$) and varying values of $\delta$ on the horizontal
axis. The unmodified t-test as suggested by [1] is always best. In comparison,
the weighted t-test combination $t_{comb}$ suffers from a small power loss which
seems non-negligible only for very small stage-1 sample sizes below $n_1 = 10$
(where the type I error control of the "reviewed" t-test might be a concern).
For all simulated scenarios with $n_1 = 30$, the difference in power was always
below 1%. In contrast, Fisher's p-value combination typically loses 3 to 4% of
power when t-test power is less than 95% and up to 7% for some scenarios
($\mu = 0.1$, $\delta = 0.15$ with power t-test 76.4%, power t-combination 76.3%,
power p-value combination 69.6%).

4 The distribution of Kieser and Friede's t-test statistic

To investigate the type I error of the t-test after a blinded sample size review,
Kieser and Friede [1] write the t-test statistic as a function of four components
$D_1$, $V_1$, $D_2$ and $V_2^*$ (see page 3575 of [1]) for which they derive respective
distributions. However, the distribution of $V_2^*$ given $(D_1, V_1)$ mentioned there
is an approximation, not the exact distribution. Hence, the "actual" type I
error rates in [1] are also approximate, possibly masking a minor type I error
level inflation.

The following uses the notation from [1]. It shows that the conditional
distribution of $V_2^* | (V_1, D_1)$ is not $\chi^2(2n_2)$.

Without loss of generality it can be assumed that $\sigma^2 = 1$. We have

$$V_2^* = V_2 + \frac{n_1 n_2}{n_1 + n_2} \left( (\bar{X}_{11} - \bar{X}_{21})^2 + (\bar{X}_{12} - \bar{X}_{22})^2 \right).$$

$V_2 | (V_1, D_1) \sim \chi^2(2n_2 - 2)$ is obvious. It is also obvious that if we condition
on $V_1$ only, and suppose that this determines the sample size $n_2$ uniquely, we
have

$$D_j^* := \sqrt{\frac{n_1 n_2}{n_1 + n_2}}\, (\bar{X}_{1j} - \bar{X}_{2j}) \sim N(0, 1), \quad j = 1, 2,$$

such that $D_1^*$ and $D_2^*$ are stochastically independent. Thus, in this case
$D_1^{*2} + D_2^{*2} \sim \chi^2(2)$, so if $n_2$ is a function of $V_1$, but not $D_1$, the claim
$V_2^* | V_1 \sim \chi^2(2n_2)$ holds. This was noted by [7].

If we condition on both $V_1$ and $D_1$, $V_2$ and $(D_1^*, D_2^*)$ are still independent,
but $D_1^*$ and $D_2^*$ are no longer.

By applying a theorem on conditional normal distributions (see e.g. [8],
page 35) and some well-known results on matrix decompositions, it can be
shown that the true conditional distribution of $V_2^*$ is a mixture distribution:

$$V_2^* | (V_1, D_1) \stackrel{d}{=} \chi^2_{2n_2 - 1} + z_2^2,$$

where "$\stackrel{d}{=}$" denotes "equal in distribution" and $z_2^2$ has the "rescaled"
non-central $\chi^2$-distribution

$$z_2^2 \sim \frac{n_1}{n_1 + n_2} \cdot \chi^2\left(1;\ \frac{n_2}{n_1} \left(D_1 - \sqrt{\frac{n_1}{2}}\, \Delta\right)^2\right).$$

The assumption $V_2^* | (V_1, D_1) \sim \chi^2(2n_2)$ will often very closely
approximate this real distribution.

5 Sample size reviews when testing for non-inferiority



The preceding sections have dealt with the superiority test $H_0: \mu = 0$.
While type I error violations in this context are extremely small, it was 
noted by [13] that more serious violations arise in the case of non-inferiority 
and equivalence testing and that these are persistent with larger sample sizes. 
This section gives an intuitive explanation for this. 

Assume that in the two-sample case, it is intended to test the
non-inferiority hypothesis $H_0: \mu_1 - \mu_2 \le \delta$ on data $x_{ijk} \sim N(\mu_j, \sigma^2)$ where
$i = 1, 2$ indexes stage, $j = 1, 2$ treatment group, $k = 1, \ldots, n_i$ sample unit
and $\delta$ is a fixed non-inferiority margin. Sample size reassessment after stage
1 determines the stage-2 sample size via

$$n_2 = 4 \cdot \frac{(u_{1-\alpha} + u_{\beta})^2}{(\theta - \delta)^2} \cdot \hat{\sigma}^2 \qquad (2)$$

(where $u_a$ is the $a$-quantile of $N(0, 1)$, $\beta$ is the desired power of the ordinary
two-sample t-test and $\theta$ is the assumed true effect difference between the
treatments) as a function of the "total variance"

$$\hat{\sigma}^2 = \frac{1}{2n_1 - 1} \sum_{j=1}^{2} \sum_{k=1}^{n_1} (x_{1jk} - \bar{x}_1)^2$$

with $\bar{x}_1 = \frac{1}{2n_1} \sum_{j=1}^{2} \sum_{k=1}^{n_1} x_{1jk}$. This, however, does not correspond to a "blinded"
sample size review of the corresponding superiority test. To see this, notice
that the described test can also be represented as a test of $H_0: \mu_1^* - \mu_2 \le 0$
on the "shifted" data

$$x_{ijk}^* = \begin{cases} x_{ijk} - \delta & \text{if } j = 1, \\ x_{ijk} & \text{if } j = 2. \end{cases} \qquad (3)$$

A blinded sample size review of $(x_{ijk}^*)$ would also use (2), but with

$$\hat{\sigma}^{*2} = \frac{1}{2n_1 - 1} \sum_{j=1}^{2} \sum_{k=1}^{n_1} (x_{1jk}^* - \bar{x}_1^*)^2$$

instead of $\hat{\sigma}^2$. It is easy to see that

$$\hat{\sigma}^2 = \hat{\sigma}^{*2} - \frac{n_1 \delta^2}{2(2n_1 - 1)} + \frac{n_1 \delta}{2n_1 - 1}\, (\bar{x}_{11} - \bar{x}_{12}).$$

This formula contains the quantity $\frac{n_1 \delta}{2n_1 - 1} (\bar{x}_{11} - \bar{x}_{12})$ which links the realized
difference in means $\bar{x}_{11} - \bar{x}_{12}$ with the true difference $\delta$ of means under $H_0$. If,
for example, $\delta < 0$, then $n_2$ decreases with increasing realized values of
$\bar{x}_{11} - \bar{x}_{12}$. Relative to the blinded superiority sample size review, this means
that fewer additional sample elements are taken when stage-1 evidence is in
favor of the alternative and vice versa. Obviously, this must be associated with
an increase of type I error under $H_0$. Conversely, the test gets conservative
when $\delta > 0$. These tendencies were also noticed by [13] in simulations.
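
The link between the total variance of the original stage-1 data and that of the shifted data can be checked numerically. The Python sketch below verifies, for arbitrary simulated data, the identity $\hat{\sigma}^2 = \hat{\sigma}^{*2} - \frac{n_1 \delta^2}{2(2n_1 - 1)} + \frac{n_1 \delta}{2n_1 - 1} (\bar{x}_{11} - \bar{x}_{12})$, where both total variances divide the pooled sum of squares by $2n_1 - 1$. The function names and the data-generating values are hypothetical; NumPy is assumed.

```python
import numpy as np

def total_variance(group1, group2):
    """Blinded one-sample variance of the pooled stage-1 data
    (sum of squares about the overall mean, divided by 2*n1 - 1)."""
    return np.concatenate([group1, group2]).var(ddof=1)

def variance_identity_gap(group1, group2, delta):
    """Difference between the total variance of the original data and the
    expression in terms of the shifted data; should be numerically zero."""
    n1 = len(group1)
    var_orig = total_variance(group1, group2)
    var_shift = total_variance(group1 - delta, group2)  # shift group 1 only
    diff = group1.mean() - group2.mean()                # realized difference
    rhs = (var_shift
           - n1 * delta**2 / (2 * (2 * n1 - 1))
           + n1 * delta / (2 * n1 - 1) * diff)
    return var_orig - rhs

rng = np.random.default_rng(1)
n1, delta = 8, -0.5
g1 = rng.normal(delta, 1.0, n1)   # boundary of H0: mu1 - mu2 = delta
g2 = rng.normal(0.0, 1.0, n1)
print(f"gap = {variance_identity_gap(g1, g2, delta):.2e}")
```

The printed gap is zero up to floating-point error, which illustrates that the sample size rule based on the unshifted total variance depends on the realized difference in means.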

The "blinded" non-inferiority test is thus equivalent to an "unblinded"
superiority test and hence subject to type I error biases that afflict an
unmodified t-test applied after the sample size was modified using the observed
difference in means. To be sure, the user of the blinded non-inferiority
re-estimation does not get to see the realized value of $\bar{x}_{11} - \bar{x}_{12}$, but
nevertheless it has the described impact on the modified sample size $n_2$.



6 Discussion 



This paper investigates a number of situations with normally distributed
observations where blinded sample size review according to Kieser and Friede
does not control the type I error rate. In superiority testing, the
corresponding inflations are extremely small and occur with sample sizes that
will rarely be of practical relevance. The method can thus safely be used in
practice.

As an alternative for which type I error control can be proved, it is
also possible to combine the t-test statistics of the two stages directly
using data-dependent weights. Regarding the outcome in practical
applications, these two methods are virtually indistinguishable. In contrast,
p-value combination and related methods suffer from some power loss due to the
fact that they have to work with a predetermined "intended" stage-2 sample size
and lose power if one deviates from this intention in the sample size review.

Non-inferiority testing is subject to much more severe type I error
violations. This is due to its equivalence with unblinded superiority testing.
As a consequence, blinded SSR is not an acceptable method for non-inferiority
testing in confirmatory clinical trials.

7 Technical Appendix

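This appendix shows that $V_2^* | (V_1, D_1)$ has the distribution given in section
4. By applying the usual theorems on conditional normal distributions (see
e.g. [8], page 35), we obtain the bivariate distribution

$$\begin{pmatrix} D_1^* \\ D_2^* \end{pmatrix} \Big|\, (V_1, D_1) \sim N_2\left( \sqrt{\frac{n_2}{2(n_1 + n_2)}} \left(D_1 - \sqrt{\frac{n_1}{2}}\, \Delta\right) \begin{pmatrix} 1 \\ -1 \end{pmatrix},\ \begin{pmatrix} \frac{2n_1 + n_2}{2(n_1 + n_2)} & \frac{n_2}{2(n_1 + n_2)} \\ \frac{n_2}{2(n_1 + n_2)} & \frac{2n_1 + n_2}{2(n_1 + n_2)} \end{pmatrix} \right). \qquad (4)$$

To derive the distribution of $D^{*\prime} D^* = D_1^{*2} + D_2^{*2}$, we can make use of the
following well-known general result: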



Suppose $x \sim N_p(\mu, V)$ and let $V^{1/2}$ be a root of $V$ (i.e. a matrix that
fulfills $V^{1/2} V^{1/2} = V$). Then $x \stackrel{d}{=} V^{1/2} y$ where $y \sim N_p(V^{-1/2} \mu, I_p)$.

Furthermore assume that $A$ is a positive semidefinite symmetric $p \times p$
matrix. Then

$$x' A x \stackrel{d}{=} y' V^{1/2} A V^{1/2} y. \qquad (5)$$

$V^{1/2} A V^{1/2}$ can also be written as an eigenvalue decomposition $C \Lambda C'$, where
$\Lambda = diag(\lambda_i)_{i=1,\ldots,p}$ is the diagonal matrix of eigenvalues and $C$ is the
matrix whose columns are the corresponding eigenvectors. Inserting this into
(5), we obtain

$$x' A x \stackrel{d}{=} \sum_{i=1}^{p} \lambda_i z_i^2$$

with $z = (z_1, \ldots, z_p)' \sim N_p(C' V^{-1/2} \mu, I_p)$.
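
Using this general result in the particular case by setting $\mu$ and $V$ to
the mean and covariance matrix in (4) and $A = I_2$, it is easy to see that
$V$ has eigenvalues 1 and $\frac{n_1}{n_1 + n_2}$ and eigenvectors $\frac{1}{\sqrt{2}} \cdot (1, 1)'$ and $\frac{1}{\sqrt{2}} \cdot (1, -1)'$.
Consequently, conditional on $(V_1, D_1)$, we obtain (with the eigenvalues
absorbed into the variances of $z_1$ and $z_2$):

$$D_1^{*2} + D_2^{*2} \stackrel{d}{=} z_1^2 + z_2^2$$

where $z_1 \sim N(0, 1)$ and $z_2 \sim N\left(\sqrt{\frac{n_2}{n_1 + n_2}} \cdot \left(D_1 - \sqrt{\frac{n_1}{2}}\, \Delta\right),\ \frac{n_1}{n_1 + n_2}\right)$. Hence,
$V_2^* | (V_1, D_1) \stackrel{d}{=} \chi^2_{2n_2 - 1} + z_2^2$ where $z_2^2$ has the "rescaled" non-central
$\chi^2$-distribution

$$z_2^2 \sim \frac{n_1}{n_1 + n_2} \cdot \chi^2\left(1;\ \frac{n_2}{n_1} \left(D_1 - \sqrt{\frac{n_1}{2}}\, \Delta\right)^2\right).$$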
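
The eigenstructure used in this step can be verified numerically. The Python sketch below builds the $2 \times 2$ conditional covariance matrix with diagonal entries $\frac{2n_1 + n_2}{2(n_1 + n_2)}$ and off-diagonal entries $\frac{n_2}{2(n_1 + n_2)}$ and checks its eigenvalues and eigenvectors for one arbitrary pair of stage sizes. The function name and the chosen values $n_1 = 5$, $n_2 = 12$ are ours; NumPy is assumed.

```python
import numpy as np

def conditional_covariance(n1, n2):
    """Covariance matrix of (D1*, D2*) given (V1, D1), as in (4)."""
    diag = (2 * n1 + n2) / (2 * (n1 + n2))
    off = n2 / (2 * (n1 + n2))
    return np.array([[diag, off], [off, diag]])

n1, n2 = 5, 12
cov = conditional_covariance(n1, n2)

# eigh returns the eigenvalues of a symmetric matrix in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(cov)
print("eigenvalues:", eigenvalues)
print("expected   :", [n1 / (n1 + n2), 1.0])

# The eigenvectors are proportional to (1, -1)' and (1, 1)'
v_minus = np.array([1.0, -1.0]) / np.sqrt(2)
v_plus = np.array([1.0, 1.0]) / np.sqrt(2)
print(np.allclose(cov @ v_minus, n1 / (n1 + n2) * v_minus),
      np.allclose(cov @ v_plus, v_plus))
```

The trace of this matrix equals $1 + \frac{n_1}{n_1 + n_2}$, which is strictly smaller than 2 whenever $n_2 > 0$.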

We note in passing that if $D^{*\prime} D^*$ were $\chi^2(2)$-distributed, it would have
$E(D^{*\prime} D^*) = 2$. The true conditional expected value given $(V_1, D_1)$ can be
obtained from

$$E(D^{*\prime} D^*) = tr(E(D^* D^{*\prime})) = tr(\Sigma_{D^*} + \mu_{D^*} \mu_{D^*}') = 1 + \frac{n_1}{n_1 + n_2} + \frac{n_2}{n_1 + n_2} \left(D_1 - \sqrt{\frac{n_1}{2}}\, \Delta\right)^2,$$

where $\mu_{D^*}$ and $\Sigma_{D^*}$ denote the conditional mean and covariance matrix of $D^*$.
Of course, this is not equal to 2 in general. However, $E\left(\left(D_1 - \sqrt{\frac{n_1}{2}}\, \Delta\right)^2\right) = 1$
holds, since $D_1 \sim N\left(\sqrt{\frac{n_1}{2}}\, \Delta, 1\right)$. If we then ignore that $n_2$ is a random
variable as well, we obtain the approximate unconditional expected value
$1 + \frac{n_1}{n_1 + n_2} + \frac{n_2}{n_1 + n_2} = 2$.